My early tests have shown differences in population mean distributions (being normal or not) due to differences in linkage and deleterious mutation frequency (relative to neutral mutations). These tests extend this by including the strength of genetic drift as a treatment, mediated by a population size of N = 10, 100, or 1000 (very strong, intermediate, weaker drift, respectively).
The SLiM script follows the neutral evolution of 1 trait under the control of ten gene regions where phenotype-affecting neutral mutations can appear. Trait-independent neutral and deleterious can also appear, with the rate of deleterious mutations being varied in script between 1 of 3 values: 0.0 (no deleterious mutations), 0.1 (1 in 10 mutations are deleterious), and 1.0 (equal chance deleterious or neutral). The linkage between these gene regions is also altered in script via recombination rate between regions, which is the recombination rate between the end base position of one region and the start of the next (1bp); it can be 0 (complete linkage), 1e-8 (intermediate linkage), or 0.5 (complete independence). The population size was either 10, 100, or 1000, with larger values weakening the effects of drift. It is expected that with population sizes of 10 the strong effects of drift will increase the variance of runs, and therefore exacerbate the effects of deleterious mutation and linkage on affecting the distribution of means taken at generation 20000.
The overall recombination rate across the whole genome (apart of course from the between-region rate explained in the treatment above) is 1e-08, SLiM’s default. The mutation rate is 3.24e-09, an average over several recombination rates for multiple taxa.
# Load data
# Create an empty matrix (shapmat) for the loop
shapmat <- matrix(nrow=length(unique(slim_out_multiNRD_final$delmu)), ncol=length(unique(slim_out_multiNRD_final$r)))
rownames(shapmat) <- c("Del = 0", "Del = 0.1", "Del = 1")
colnames(shapmat) <-c("r = 0", "r = 1e-08", "r = 0.5")
#Create a shapmat for each population size
for (i in 1:length(unique(slim_out_multiNRD$N))) {
shapmat_temp <- matrix(nrow=length(unique(slim_out_multiNRD_final$delmu)), ncol=length(unique(slim_out_multiNRD_final$r)))
rownames(shapmat_temp) <- c("Del = 0", "Del = 0.1", "Del = 1")
colnames(shapmat_temp) <-c("r = 0", "r = 1e-08", "r = 0.5")
assign(paste0("shapmat_N",i), shapmat_temp)
}
In this code we create a matrix to store the p values for each set of treatments with a certain population size. We can then plot this
library(ggplot2)
# Set up some labels for the plot
N.labs <- c("N = 10", "N = 100", "N = 1000")
names(N.labs) <- c("10", "100", "1000")
ggplot(dat_shap_NRD, aes(x = as.factor(delmu), y = pvalue, fill = as.factor(r))) +
geom_col(position = "dodge") +
facet_grid(N ~ ., scale = "free", labeller = labeller( N = N.labs)) +
labs(fill = "Recombination\nbetween\nregions", x = "Prevalence of deleterious mutations", y = "p-value") +
geom_hline (yintercept = 0.05, linetype = "dashed") + # Add a dashed line to indicate significance i.e. non-normality
theme_classic()
This figure shows us that with small populations under lots of drift, the p-value of a Shapiro test is reduced - even with complete linkage and no deleterious mutations (r = 0, delmu = 0), the p-value is under 0.25. With other population sizes this is larger. In these conditions, it is most expected for the population mean to be normally distributed, hence the high p-value, since there are no deleterious mutations to randomly take out additive effects in some runs, skewing the distribution of means. In addition, because everything is linked, all the additive effects (coming from a normal distribution) are always collected together. Yet with a small population size, the p-value reduces, and even becomes significant (non-normality) with complete independence of loci. This is because the random effects of drift to lose or fix individual effects are much stronger, so each run is skewed more to a certain direction due to the increased intensity of drift.
With lots of deleterious mutation (delmu = 1), the effects become strange with differing population size. When population size is small, complete linkage or intermediate results in non-normality. This could be because deleterious mutations are linked to the additive effects in this case, so both this and the effects of strong drift act in tandem to bias the phenotype in one direction or the other.
These p-values for normality can be compared with the density curves of each treatment combination, shown below
The next figures are examples of the Brownian walks of the model under different parameters